Test case reduction for code regression testing

ABSTRACT

In at least one embodiment, a system performs regression testing of software using selected test cases. In at least one embodiment, the system selects the test case for regression testing based on whether the test case correlates with modified code. In at least one embodiment, a test case correlates with the modified code if the test case tests all or a proper subset of the modified code. In at least one embodiment, if a test case does not test any of the modified code, then the test case is not used in the regression testing of the modified code.

CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit under 35 U.S.C. § 119(e) and 37 C.F.R. § 1.78 of U.S. Provisional Application No. 61/794,260, filed Mar. 15, 2013, which is incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

Field of the Invention

The present invention relates in general to the field of electronics, and more specifically to computer software testing and selectively executing test cases to reduce overall test case execution.

Description of the Related Art

Testers often test computer software using a suite of test cases. Test cases are also often referred to as test scripts. The test cases often include a set of conditions or variables, and the tester tests the computer software in accordance with the test cases. Generally, the test case provides a known input with an expected output, and the tester determines if the computer software passes or fails based on whether the computer software respectively produced the expected output or another output.

Testers test computer software when certain conditions occur. An exemplary condition is when computer software is modified either through the addition of new code or the revision of existing code. Regression testing tests the computer software in accordance with test cases to determine whether the modified computer software produces the correct predetermined behavior despite the changes made to the code. Since computer software frequently changes during the process of development for initial and new versions, regression testing also often occurs frequently. Any product that invests in exhaustive regression testing invariably has a very large automated regression test suite of test cases. The objective of this test suite is that it can be executed with every code modification to identify potential regressions, i.e. errors.

FIG. 1 depicts computer software development and testing system 100, which performs exhaustive test case regression testing. A software developer utilizes computer 102 to develop code 104. From computer 102, the developer performs a code check-in 106 to check the code 104 into computer system 108. The code 104 is all or part of software 110. Computer system 108 includes a source control system 109 that assigns a new revision number to the software 110 to provide version control and stores each version of the software 110 in a file versioning repository memory 111 of the computer system 108. The source control system 109 sends CHECK_IN data to notify a regression testing system 111 of computer system 113 when code 104 is checked-in. The CHECK_IN data provides specific details associated with the code check-in. Exemplary CHECK_IN data includes:

-   Identification of the user who checked in the code 104;
-   The files of code 104 that were checked in;
-   The check in time;
-   The assigned revision number; and
-   Identification of the repository in which the code 104 was checked in.

The source control system 109 is a software program executed by the computer system 108, and the regression testing system 111 is a software program executed by the computer system 113.

The regression testing system 111 performs regression testing to determine whether the code 104 caused any functional or other errors in the software 110. Regression testing can involve one or more large test suites that typically have thousands of test cases and can take between 2 and 30 minutes each to run. Regression testing system and process 112 includes a test suite 114, and test suite 114 includes N test cases, where N is an integer representing the number of test cases in test suite 114. The regression system and process 112 is incorporated as data and applications in computer system 108. Regression testing process 116 runs each of the N test cases in test suite 114. For large values of N, such as greater than 100, when each test case 1 through N runs serially in computer system 108, the regression test suite 114 may take in excess of 48 hours to run. Running regression test suite 114 serially can, thus, take a significant amount of time. Utilizing parallel processing, by adding one or more additional computer systems to computer system 108 to run all or a subset of the test cases in parallel, reduces regression testing time but can increase costs significantly. The results data 118 contains the results of the regression testing.
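By way of illustration only, the run-time arithmetic described above can be sketched as follows. The function names are hypothetical, and the figures (100 test cases at 30 minutes each, spread over 4 machines) are assumptions chosen from within the ranges stated above rather than measurements of any particular test suite.

```python
import math


def serial_hours(num_tests: int, minutes_per_test: float) -> float:
    """Wall-clock hours when every test case runs one after another."""
    return num_tests * minutes_per_test / 60.0


def parallel_hours(num_tests: int, minutes_per_test: float, machines: int) -> float:
    """Wall-clock hours when test cases are spread evenly over several machines."""
    tests_per_machine = math.ceil(num_tests / machines)
    return tests_per_machine * minutes_per_test / 60.0


print(serial_hours(100, 30))       # 50.0
print(parallel_hours(100, 30, 4))  # 12.5
```

Under these assumed figures the serial run exceeds 48 hours, while the parallel run is shorter at the expense of three additional machines.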

Thus, conventional regression testing is costly in terms of time, expense, or both. However, without regression testing, computer software can produce errors that may not be caught and/or understood until actual deployment.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention may be better understood, and its numerous objects, features and advantages made apparent to those skilled in the art by referencing the accompanying drawings. The use of the same reference number throughout the several figures designates a like or similar element.

FIG. 1 (labeled prior art) depicts a computer software development and testing system with exhaustive test case regression testing.

FIG. 2 depicts a computer software development and testing system with selective test case regression testing.

FIG. 3 depicts a test case-to-code correlation process.

FIGS. 4A-4M (collectively referred to as FIG. 4) depict an exemplary, partial code coverage determination result.

FIG. 5 depicts a selective test case regression testing process.

FIG. 6 depicts a block diagram illustrating a network environment.

FIG. 7 depicts an exemplary computer system.

DETAILED DESCRIPTION

In at least one embodiment, a system performs regression testing of software using selected test cases. In at least one embodiment, the system selects the test case for regression testing based on whether the test case correlates with modified code. In at least one embodiment, a test case correlates with the modified code if the test case tests all or a proper subset of the modified code. In at least one embodiment, if a test case does not test any of the modified code, then the test case is not used in the regression testing of the modified code.

In at least one embodiment, the portions of code that correlate with each test case are identified, and data representing the test case-to-code correlation is stored. In at least one embodiment, after modification of one or more code portions of computer software, the system determines which of the one or more portions of code of the computer software were modified. The system then accesses the test case-to-code correlation data and identifies which of the one or more test cases correlate with the one or more portions of code that were modified. After the identification, the system tests the computer software using the one or more identified test cases and generates test result data.

FIG. 2 depicts computer software development and testing system 200, which performs selective test case regression testing. The computer software development and testing system 200 includes a computer system 201, which may be any type of computer system such as a server. The computer system 201 includes a source control system 203 to manage changes in software 204. The source control system 203 can include any type of source code control application, such as Apache Subversion (SVN) from the Apache Software Foundation, Git from www.git-scm.com, Clearcase from International Business Machines of Armonk, N.Y., Integrity from PTC Integrity with offices in Chicago, Ill., Visual Source Safe by Microsoft Corp. of Redmond, Wash., and the Polytron Version Control System (PVCS) from Polytron Corp. from Serena Software, Inc. of San Mateo, Calif.

From computer system 216, a developer performs a code check-in 218 to check the code 210 into computer system 201. The code 210 is all or part of software 204. The source control system 203 assigns a new revision number to the code 210 to provide version control and stores each version of the code 210 in a file versioning repository memory 211 of the computer system 201. The source control system 203 sends CHECK_IN data to notify a regression testing system 219 of computer system 202 when code 210 is checked-in. The CHECK_IN data provides specific details associated with the code check-in. Exemplary CHECK_IN data includes:

-   Identification of the user who checked in the code 210;
-   The files of code 210 that were checked in;
-   The check in time;
-   The assigned revision number; and
-   Identification of the repository in which the code 210 was checked in.

The source control system 203 is a software program executed by the computer system 201, and, in at least one embodiment, the regression testing system 219 is a software program executed by the computer system 202. In at least one embodiment, the source control system 203 and the regression testing system 219 are software programs executed by the same computer system, such as computer system 201.
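By way of illustration only, the exemplary CHECK_IN data listed above could be represented as a simple record. The class name `CheckInData`, its field names, the function `on_check_in`, and the sample values are all hypothetical; the description does not prescribe any particular data format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class CheckInData:
    """Hypothetical record mirroring the exemplary CHECK_IN fields."""
    user: str                # who checked in the code
    files: List[str]         # files of code that were checked in
    check_in_time: datetime  # the check in time
    revision: int            # revision number assigned by source control
    repository: str          # repository in which the code was checked in


def on_check_in(data: CheckInData) -> List[str]:
    """Return the checked-in files that a regression testing system would examine."""
    return sorted(data.files)


# Sample values are invented for the sketch.
example = CheckInData(
    user="dev1",
    files=["src/ClassMap.java"],
    check_in_time=datetime(2013, 3, 15, 9, 30),
    revision=42,
    repository="main",
)
print(on_check_in(example))
```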

The computer system 202 has access to test suite 206. Test suite 206 is stored in a non-transitory computer readable medium such as any type of memory device. The bracket 208 indicates that elements below the bracket 208 are accessible to and, in at least one embodiment, are part of the regression testing system 219.

In at least one embodiment, the computer software development and testing system 200 performs selective test case regression testing by identifying and using test cases in test suite 206 that correlate with the one or more portions of checked-in modified code 210. In at least one embodiment, the computer system 202 generates the test case-to-code correlation data; however, any computer system that can test the software with the test suite 206 and identify test case-to-code correlation can generate the test case-to-code correlation data 212.

FIG. 3 depicts a test case-to-code correlation process 300, which represents one process for generating the test case-to-code correlation data 212. Referring to FIGS. 2 and 3, operation 302 determines test case code coverage. In other words, operation 302 determines which code in software 204 is exercised by each of test cases 1-N. In at least one embodiment, the computer system 202 utilizes a code coverage application 214 to determine which code is exercised by a test case. Exemplary code coverage applications include EMMA. EMMA is an open source toolkit for measuring and reporting Java code coverage. In at least one embodiment, when the computer system 202 executes a test case against the software 204, the code coverage application 214 determines which code was exercised by the test case. The granularity of code coverage is a matter of design choice. In at least one embodiment, the code coverage application can identify exercised code at one or more levels. For example, for Java-based software, the code coverage application can identify exercised code at line, block, method, class, and/or package levels. The level of granularity for each test case 1-N is a matter of design choice. Thus, the granularity between the test cases 1-N can be the same or can vary between one or more test cases. In a Java programming language context, lines of code make up a “block” of code, and blocks of code make up a “method”. Multiple methods make up a “class”, and multiple classes make up a “package”. Java is a programming language originally developed by Sun Microsystems of California, USA, which has merged with Oracle Corporation of California, USA.

FIGS. 4A-4M, collectively referred to as FIG. 4, depict an exemplary, partial code coverage determination result by code coverage application 214. FIG. 4A represents an overview of code coverage by an exemplary i^(th) test case_(i). As indicated by table 402, overall, test case_(i) exercised 86% of the classes, 62% of the methods, 56% of the blocks, and 58% of the lines of an embodiment of software 204. Table 404 provides a more specific code coverage breakdown by package. For example, in the org.apache.velocity.app.tools package, there was no code coverage by test case_(i) on a class, method, block, or line level. Thus, test case_(i) does not correlate to the org.apache.velocity.app.tools package. However, for the package org.apache.velocity.util.introspection, there is 93% class coverage, 83% method coverage, 72% block coverage, and 81% line coverage.
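By way of illustration only, the package-level reading of table 404 can be sketched as follows: a test case is treated as correlating with a package when it exercised at least some code in that package. The dictionary layout and the function name `correlated_packages` are assumptions; only the two rows of percentages are taken from the description above.

```python
from typing import Dict, List

# Coverage percentages for one test case, keyed by package and then by level.
# The two rows reuse the figures quoted above for table 404.
coverage: Dict[str, Dict[str, int]] = {
    "org.apache.velocity.app.tools": {
        "class": 0, "method": 0, "block": 0, "line": 0,
    },
    "org.apache.velocity.util.introspection": {
        "class": 93, "method": 83, "block": 72, "line": 81,
    },
}


def correlated_packages(report: Dict[str, Dict[str, int]]) -> List[str]:
    """Packages in which the test case exercised at least some code."""
    return [
        package
        for package, levels in report.items()
        if any(percent > 0 for percent in levels.values())
    ]


print(correlated_packages(coverage))
# ['org.apache.velocity.util.introspection']
```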

FIG. 4B depicts code coverage by test case_(i) for classes in the package org.apache.velocity.util.introspection. The package org.apache.velocity.util.introspection has four classes identified in table 406, i.e. ClassMap, ClassMap$1, ClassMap$CacheMiss, and ClassMap$MethodInfo. Each class except ClassMap$1 has code coverage by test case_(i).

FIGS. 4C-4M depict the code coverage on a method, block, and line level for the class ClassMap. An “E” next to a bracket indicates that the bracketed code was tested by test case_(i). An “E′” next to a bracket indicates that at least part of the bracketed code was tested by test case_(i). A “NE” next to a bracket indicates that none of the bracketed code was tested by test case_(i). Lines delineated between “/**” and “*/” indicate comments, which are not executed.

Referring to FIG. 3, operation 304 of the test case-to-code correlation process 300 then stores the correlation between test case_(i) and the exercised code indicated in FIG. 4 in the test case-to-code correlation data 212. The test case-to-code correlation process 300 repeats for all other test cases 1-N to generate a complete map of test case-to-code correlation data 212. In at least one embodiment, the test case-to-code correlation process 300 repeats for each new test case added to the test suite 206 and/or upon addition of a new test suite with one or more new test cases. As previously indicated, the granularity of the test-to-code correlation data 212 is a matter of design choice. For example, with regard to the package org.apache.velocity.util.introspection, the granularity can be to correlate at the method level so that if any of the 3 exercised methods ClassMap, ClassMap$CacheMiss, or ClassMap$MethodInfo are modified, then computer system 202 executes test case_(i). In at least one embodiment, the granularity of the test-to-code correlation data 212 is at the line level so that computer system 202 executes test case_(i) only when lines of code identified with an E or E′ are modified.
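By way of illustration only, operations 302 and 304 can be sketched as a loop that measures coverage for each test case and stores the result in a map. The names `build_correlation`, `add_test_case`, and `runner` are hypothetical, and the `fake_coverage` table is a stand-in for a real code coverage application; the sketch does not call EMMA or any other actual tool.

```python
from typing import Callable, Dict, Iterable, Set

# A coverage runner takes a test case name and returns the code units it exercised.
CoverageRunner = Callable[[str], Set[str]]


def build_correlation(
    test_cases: Iterable[str], run_with_coverage: CoverageRunner
) -> Dict[str, Set[str]]:
    """Measure coverage per test case (302) and store the correlation (304)."""
    correlation: Dict[str, Set[str]] = {}
    for test in test_cases:
        correlation[test] = set(run_with_coverage(test))
    return correlation


def add_test_case(
    correlation: Dict[str, Set[str]], test: str, run_with_coverage: CoverageRunner
) -> None:
    """Extend the stored map when a new test case is added to the suite."""
    correlation[test] = set(run_with_coverage(test))


# Stand-in for a real coverage tool; the recorded units are invented.
fake_coverage: Dict[str, Set[str]] = {
    "test_1": {"ClassMap", "ClassMap$CacheMiss", "ClassMap$MethodInfo"},
    "test_2": {"Template"},
}


def runner(name: str) -> Set[str]:
    return fake_coverage.get(name, set())


data = build_correlation(["test_1", "test_2"], runner)
fake_coverage["test_3"] = {"ClassMap"}
add_test_case(data, "test_3", runner)
print(sorted(data))  # ['test_1', 'test_2', 'test_3']
```

Storing a set of code units per test case keeps the map independent of granularity: the units may be packages, classes, methods, blocks, or lines, depending on what the coverage tool reports.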

FIG. 5 depicts a selective test case regression testing process 500. Referring to FIGS. 2 and 5, as previously described, a developer utilizes computer system 216 to modify code 210. In at least one embodiment, code 210 represents code that is part of software 204. The developer performs a code check-in 218. Computer system 216 can communicate with computer system 201 via any means including via a local or wide area network (such as the Internet). Computer system 216 can also be part of computer system 201.

Once computer system 201 notifies computer system 202 that code 210 has been checked in, in operation 502, software 204 is ready for regression testing by the regression testing system 219. Checked code-test case analyzer 220 determines which code in code 210 was modified. In operation 504, the checked code-test case analyzer 220 accesses the test case-to-code correlation data 212, and, in operation 506, identifies which of the one or more test cases in test suite 206 correlate with the one or more portions of code that were modified. In at least one embodiment, operation 506 identifies which of the one or more test cases in test suite 206 correlate with the one or more portions of code that were modified by reviewing the test case-to-code correlation data 212 to identify each test case that tested the modified code. The granularity of the identification process of operation 506 is a matter of design choice. In at least one embodiment, operation 506 identifies test cases based on whether a class in code 210 was modified. Operation 506 can also identify test cases based upon a method, block, or line granularity if the test case-to-code correlation data 212 also supports the level of granularity. Additionally, the granularity of identification can be different for different test cases.
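By way of illustration only, the identification step can be sketched with correlation data recorded at different granularities for different test cases. The sketch assumes a hypothetical naming scheme in which a method is written as `Class/method`, so that a class-level entry is a prefix of every method in that class; the method names `findMethod`, `populate`, and `merge` are invented for the example.

```python
from typing import Dict, Set


def overlaps(recorded: str, modified: str) -> bool:
    """True when one code unit is the same as, or contains, the other."""
    return (
        recorded == modified
        or modified.startswith(recorded + "/")
        or recorded.startswith(modified + "/")
    )


def identify_tests(correlation: Dict[str, Set[str]], modified: Set[str]) -> Set[str]:
    """Return every test case whose recorded code overlaps the modified code."""
    return {
        test
        for test, units in correlation.items()
        if any(overlaps(unit, change) for unit in units for change in modified)
    }


correlation = {
    "test_1": {"ClassMap"},             # correlated at class granularity
    "test_2": {"ClassMap/findMethod"},  # correlated at method granularity
    "test_3": {"Template/merge"},
}

print(sorted(identify_tests(correlation, {"ClassMap/populate"})))
# ['test_1']
print(sorted(identify_tests(correlation, {"ClassMap/findMethod"})))
# ['test_1', 'test_2']
```

A test case recorded at class granularity is selected for any change inside that class, while a test case recorded at method granularity is selected only when that method, or the class as a whole, is reported as modified. Treating overlap in either direction as a match is a conservative assumption of this sketch.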

In operation 508, test case selector 222 selects the M selected test cases 224, which were identified in operation 506, for regression testing software 204. “M” is an integer greater than or equal to 1 and less than or equal to N. In at least one embodiment, M is less than N so that the number of test cases 224 used by regression testing process 226 to test software 204 is less than the number of test cases in test suite 206. In operation 510, the computer system 202 performs the regression testing process 226 by testing the software 204 with the M test cases 224. In operation 512, computer system 202 generates the test results data 228. Computer system 202 provides the test results data 228 to another computer system, such as computer system 216, for display and/or storage, stores the test results data 228 in a memory (not shown) for access by one or more computer systems, or directly displays the test results data 228. Additionally, in at least one embodiment, when the test case selector 222 identifies a particular test case, the test case selector 222 iterates through the test cases to determine which test cases are prerequisites to the identified particular test case, and determines which test cases are prerequisites to the determined prerequisite test cases, and so on, so that all test cases and prerequisite test cases are selected. Additionally, in at least one embodiment, the test cases can be divided into subsets and used on different machines to optimize cost, speed, etc.
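By way of illustration only, the prerequisite iteration and the division of test cases into subsets can be sketched as follows. The function names, the `prereqs` table, and the test case names are hypothetical, and the round-robin split is one assumed way of dividing the work among machines.

```python
from typing import Dict, List, Set


def with_prerequisites(
    selected: Set[str], prerequisites: Dict[str, Set[str]]
) -> Set[str]:
    """Add prerequisites of the selected test cases, then their prerequisites, and so on."""
    result = set(selected)
    pending = list(selected)
    while pending:
        test = pending.pop()
        for prereq in prerequisites.get(test, set()):
            if prereq not in result:  # also guards against cyclic declarations
                result.add(prereq)
                pending.append(prereq)
    return result


def split_across_machines(tests: Set[str], machines: int) -> List[List[str]]:
    """Deal the test cases round-robin into one list per machine."""
    ordered = sorted(tests)
    return [ordered[i::machines] for i in range(machines)]


prereqs = {
    "test_checkout": {"test_login"},
    "test_login": {"test_create_user"},
}
chosen = with_prerequisites({"test_checkout"}, prereqs)
print(sorted(chosen))
# ['test_checkout', 'test_create_user', 'test_login']
print(split_across_machines(chosen, 2))
# [['test_checkout', 'test_login'], ['test_create_user']]
```

The round-robin split shown here ignores prerequisite ordering; an implementation in which a prerequisite must run on the same machine before its dependent test case would need to keep each prerequisite chain together.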

FIG. 6 depicts a block diagram illustrating a network environment in which a computer software development and testing system 200 may be practiced. Network 602 (e.g. a private wide area network (WAN) or the Internet) includes a number of networked server computer systems 604(1)-(N) that are accessible by client computer systems 606(1)-(N), where N is the number of server computer systems connected to the network. Communication between client computer systems 606(1)-(N) and server computer systems 604(1)-(N) typically occurs over a network, such as a public switched telephone network over asymmetric digital subscriber line (ADSL) telephone lines or high-bandwidth trunks, for example communications channels providing T1 or OC3 service. Client computer systems 606(1)-(N) typically access server computer systems 604(1)-(N) through a service provider, such as an internet service provider (“ISP”) by executing application specific software, commonly referred to as a browser, on one of client computer systems 606(1)-(N).

Client computer systems 606(1)-(N) and/or server computer systems 604(1)-(N) may be, for example, computer systems of any appropriate design, including a mainframe, a mini-computer, a personal computer system including notebook computers, or a wireless, mobile computing device (including personal digital assistants). These computer systems are typically information handling systems, which are designed to provide computing power to one or more users, either locally or remotely. Such a computer system may also include one or a plurality of input/output (“I/O”) devices coupled to the system processor to perform specialized functions. Mass storage devices such as hard disks, compact disk (“CD”) drives, digital versatile disk (“DVD”) drives, and magneto-optical drives may also be provided, either as an integrated or peripheral device. One such example computer system is shown in detail in FIG. 7.

FIG. 7 depicts an exemplary computer system 700. Embodiments of the computer software development and testing system 200 can also be implemented purely in hardware using, for example, field programmable gate arrays or other configurable or hard-wired circuits. Embodiments of the computer software development and testing system 200 can be implemented by a computer system such as a general-purpose computer system 700 that is configured with software thereby transforming the general-purpose computer system into a specialized machine for performing the functions set forth in the software. Input user device(s) 710, such as a keyboard and/or mouse, are coupled to a bi-directional system bus 718. The input user device(s) 710 are for introducing user input to the computer system and communicating that user input to processor 713. The computer system of FIG. 7 generally also includes a video memory 714, non-transitory main memory 715 and non-transitory mass storage 709, all coupled to bi-directional system bus 718 along with input user device(s) 710 and processor 713. The non-transitory mass storage 709 may include both fixed and removable media, such as other available mass storage technology. Bus 718 may contain, for example, 32 address lines for addressing video memory 714 or main memory 715. The system bus 718 also includes, for example, an n-bit data bus for transferring DATA between and among the components, such as processor 713, main memory 715, video memory 714 and mass storage 709, where “n” is, for example, 32 or 64. Alternatively, multiplex data/address lines may be used instead of separate data and address lines. Main memory 715 and mass storage 709 represent embodiments of non-transitory, computer readable media that store software that is executable by a processor.

I/O device(s) 719 may provide connections to peripheral devices, such as a printer, and may also provide a direct connection to a remote server computer system via a telephone link or to the Internet via an ISP. I/O device(s) 719 may also include a network interface device to provide a direct connection to a remote server computer system via a direct network link to the Internet via a POP (point of presence). Such connection may be made using, for example, wireless techniques, including digital cellular telephone connection, Cellular Digital Packet Data (CDPD) connection, digital satellite data connection or the like. Examples of I/O devices include modems, sound and video devices, and specialized communication devices such as the aforementioned network interface.

Computer programs and data are generally stored as instructions and data in mass storage 709 until loaded into main memory 715 for execution.

The processor 713, in one embodiment, is a microprocessor manufactured by Motorola Inc. of Illinois, Intel Corporation of California, or Advanced Micro Devices of California. However, any other suitable single or multiple microprocessors or microcomputers may be utilized. Main memory 715 is comprised of dynamic random access memory (DRAM). Video memory 714 is a dual-ported video random access memory. One port of the video memory 714 is coupled to video amplifier 716. The video amplifier 716 is used to drive the display 717. Video amplifier 716 is well known in the art and may be implemented by any suitable means. This circuitry converts pixel DATA stored in video memory 714 to a raster signal suitable for use by display 717. Display 717 is a type of monitor suitable for displaying graphic images.

The computer system described above is for purposes of example only. The computer software development and testing system 200 may be implemented in any type of computer system or programming or processing environment. It is contemplated that the computer software development and testing system 200 might be run on a stand-alone computer system, such as the one described above. The computer software development and testing system 200 might also be run from a server computer system that can be accessed by a plurality of client computer systems interconnected over an intranet network. Finally, the computer software development and testing system 200 may be run from a server computer system that is accessible to clients over the Internet.

Embodiments of the computer software development and testing system 200 can also be implemented purely in hardware using, for example, field programmable gate arrays or other configurable or hard-wired circuits.

Thus, a system performs regression testing of software using selected test cases. In at least one embodiment, the system selects the test case for regression testing based on whether the test case correlates with modified code.

Although embodiments have been described in detail, it should be understood that various changes, substitutions, and alterations can be made hereto without departing from the spirit and scope of the invention as defined by the appended claims.

What is claimed is:
1. A method of testing computer software after modification of one or more portions of code of the computer software, the method comprising: performing by a computer system operations comprising accessing the code after modification of the multiple portions of the code, which includes modification of multiple classes of the code; determining which of the multiple portions of the code, including the modified multiple classes, were modified; accessing data correlating one or more test cases to respective portions of the code, wherein each test case that is correlated to a portion of the code tests the correlated portion of code during a test of the computer software; identifying one or more correlated test cases, wherein the correlated test cases are the one or more test cases that correlate with the multiple portions of the code, including the modified multiple classes, that were modified; iterating through multiple test cases to determine which of the multiple test cases are prerequisites to the one or more identified correlated test cases; continue iterating through the multiple test cases to determine which of the multiple tests cases are prerequisites to the one or more determined prerequisite test cases; repeating the continue iterating element until all prerequisite test cases are determined; and testing the code using the one or more correlated test cases and the one or more determined prerequisite test cases.
2. The method of claim 1 wherein each portion of the portions of code consist of one or more members of a group consisting of: a line of code, a block of code, a method, a class, and a package.
3. The method of claim 1 wherein the data correlating the one or more test cases to respective portions of the code comprises a map that correlates each test case with the code exercised by the test case.
4. The method of claim 1 further comprising: further performing by the computer system operations comprising: testing the code in accordance with one of the test cases; exercising a portion of the code with the test case used for the testing; and recording which portion of the code was exercised by the test case used for the testing.
5. The method of claim 4 wherein exercising the portion of the code with the test case used for the testing comprises: calling one or more methods of the code.
6. The method of claim 4 wherein recording which portion of the code was exercised by the test case used for the testing comprises: storing the portion of the code exercised by the test case used for testing as a data map in a non-transitory, computer readable medium, wherein the data correlating one or more test cases to respective portions of the code comprises the data map.
7. The method of claim 1 further comprising: not testing the code using any test cases that do not correlate with the multiple portions of the code that were modified.
8. The method of claim 1 wherein a test case correlates with the multiple portions of the code that were modified if the test case tests all or a proper subset of the multiple portions of the code that were modified.
9. A method that includes testing computer software after modification of one or more portions of code of the computer software, the method comprising: performing by a first computer system programmed with first code stored in a first memory and executable by a first processor of the first computer system: receiving a code testing report for the computer software, wherein the code testing report is generated by a second computer system, coupled to the first computer system, executing second code stored in a second memory of the second computer system to cause the second computer system to perform: accessing the code after modification of the multiple portions of the code, which includes modification of multiple classes of the code; determining which of the multiple portions of the code, including the modified multiple classes, were modified; accessing data correlating one or more test cases to respective portions of the code, wherein each test case that is correlated to a portion of the code tests the correlated portion of code during a test of the computer software; identifying one or more correlated test cases, wherein the correlated test cases are the one or more test cases that correlate with the multiple portions of the code, including the modified multiple classes, that were modified; iterating through multiple test cases to determine which of the multiple test cases are prerequisites to the one or more identified correlated test cases; continue iterating through the multiple test cases to determine which of the multiple tests cases are prerequisites to the one or more determined prerequisite test cases; repeating the continue iterating element until all prerequisite test cases are determined; testing the code using the one or more correlated test cases and the one or more determined prerequisite test cases; generating the code testing report that includes test results of the testing of the code using the one or more correlated test cases and the one or more determined prerequisite test cases; and providing the code testing report to the first computer system.
10. The method of claim 9 wherein each portion of the portions of code consist of one or more members of a group consisting of: a line of code, a block of code, a method, a class, and a package.
11. The method of claim 9 wherein the data correlating the one or more test cases to respective portions of the code comprises a map that correlates each test case with the code exercised by the test case.
12. The method of claim 9 further comprising: executing the second code stored in the second memory of the second computer system to cause the second computer system to further perform: testing the first code in accordance with one of the test cases; exercising a portion of the first code with the test case used for the testing; and recording which portion of the first code was exercised by the test case used for the testing.
13. The method of claim 12 wherein exercising the portion of the first code with the test case used for the testing comprises: calling one or more methods of the code.
14. The method of claim 12 wherein recording which portion of the code was exercised by the test case used for the testing comprises: storing the portion of the code exercised by the test case used for testing as a data map in a non-transitory, computer readable medium, wherein the data correlating one or more test cases to respective portions of the code comprises the data map.
15. The method of claim 12 further comprising: executing second code stored in a second memory of the second computer system to cause the second computer system to not test the code using any test cases that do not correlate with the multiple portions of the code that were modified.
16. The method of claim 9 wherein a test case correlates with the multiple portions of the code that were modified if the test case tests all or a proper subset of the multiple portions of the code that were modified.